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ABSTRACT 

The velocity distribution f(v) of nearby stars is estimated, via a maximum-likelihood 
algorithm, from the positions and tangential velocities of a kinematically unbiased sample 
of 14369 stars observed by the HIPPARCOS satellite. / shows rich structure in the radial 
and azimuthal motions, vr and v v , but not in the vertical velocity, v z : there are four 
prominent and many smaller maxima, many of which correspond to well known moving 
groups. While samples of early-type stars are dominated by these maxima, also up to 
about a quarter of red main-sequence stars are associated with them. These moving groups 
are responsible for the vertex deviation measured even for samples of late-type stars; they 
appear more frequently for ever redder samples; and as a whole they follow an asymmetric- 
drift relation, in the sense that those only present in red samples predominantly have large 
\vr\ and lag in v v w.r.t. the local standard of rest (LSR). The question arise, how these 
old moving groups got on their eccentric orbits? A plausible mechanism known from the 
solar system dynamics which is able to manage a shift in orbit space is sketched. This 
mechanism involves locking into an orbital resonance; in this respect is intriguing that 
Oort's constants, as derived from HIPPARCOS data, imply a frequency ratio between 
azimuthal and radial motion of exactly Q : k = 3 : 4. 

Apart from these moving groups, there is a smooth background distribution, akin to 
Schwarzschild's ellipsoidal model, with axis ratios <Jr : : a z ~ 1 : 0.6 : 0.35. The 
contours are aligned with the v r direction, but not w.r.t. the v v and v z axes: the mean 
v z increases for stars rotating faster than the LSR. This effect can be explained by the 
stellar warp of the Galactic disk. If this explanation is correct, the warp's inner edge must 
not be within the solar circle, while its pattern rotates with frequency ^ 13kms _1 kpc _1 
retrograde w.r.t. the stellar orbits. 



Subject headings: Galaxy: kinematics and dynamics - Galaxy: solar neighborhood - 
Galaxy: structure - Methods: numerical - Stars: kinematics 
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1. INTRODUCTION 

The dynamical state of a stellar system is completely 
described by its phase-space distribution function F(x, v). 
Knowledge of that function allows one to derive the un- 
derlying gravitational potential and hence the mass dis- 
tribution, since in equilibrium F must be constant on the 
orbits supported by the potential. Furthermore, one can 
try to understand F in terms of the possible formation 
history and evolution of the stellar system under study. 
Unfortunately, F, being a six-dimensional function, is 
impossible to measure for any remote galaxy; all one can 
hope is to measure a three-dimensional projection, and 
even this requires spectra to be taken at all positions 
on the galaxy. For the Milky Way the situation is com- 
pletely different. In a sense it is worse, since we cannot 
observe a clean and easily predictable projection. Rather 
we observe individual stars and are in danger of seeing 
the trees for the wood. However, if we forget individual 
stars and observe large samples, we can actually measure 
F(x,v) itself, at least in principle. The problem here is 
the shear amount of labour necessary to collect accurate 
measurements of stellar phase-space coordinates for large 
samples, in particular of faint stars. This has so far pre- 
vented this route from being used. Thus, we must either 
restrict ourselves to luminous and hence young stars, or 
to very nearby stars of all stellar types and ages. Since 
young stars are more likely to have not yet settled into 
equilibrium, interpretation of their phase-space distribu- 
tion is more complicated. The best we can currently hope 
to get for the old stellar population is the velocity distri- 
bution in the solar neighborhood f(v) = F(x=x@,v). 

Historically, there are two divergent approaches to 
f(v). One is mainly based on theoretical lines of ar- 
gument like the epicycle approximation and dynamical 
heating mechanisms. It manifests itself in Schwarzschild's 
(1908) ellipsoidal distribution, i.e. f(v) is assumed to be 
smooth, single-peaked, and well described by the mean 
velocity and dispersion in conjunction with Strdmberg's 
asymmetric drift relation (cf. Binney & Tremaine 1987, 
p. 198 and 202). The other approach to f(v) is more 
observational and mainly restricted to luminous stars on 
account of magnitude limits. It begins with Kapteyn's 
(1905) recognition of moving groups, which were later 
studied in more detail by Eggen (1965 and references 
therein, 1995, 1996). That is, f(v) for early-type stars 
appears to be dominated by several independent compo- 
nents. According to the standard interpretation, stars of 
a moving group have formed simultaneously in a small 
phase-space volume. The idea is that phase mixing and 
scattering processes have not yet completely washed out 
the initial correlation of the moving group, and we can 
still observe a stream of stars with similar velocities. The 
old stellar populations, on the other hand, should be 



completely mixed and obey a smooth distribution func- 
tion. This interpretation, however, is merely hypotheti- 
cal and by no means proven. Indeed, the results of this 
paper raise problems for this simple picture. 



ESA's astrometric satellite HIPPARCOS flESA 1997| ) 
provided us with positions and trigonometric parallaxes 
of unprecedented accuracy for tens of thousands of stars 
near the Sun. From these data we can, for the first time 
in history, extract large and homogeneous stellar samples 
with accurately known distances and, what is important, 
completely free of the kinematic biases that have plagued 
similar studies in the past. Since HIPPARCOS has ad- 
ditionally measured the absolute proper motions for the 
same stars, it offers us the opportunity to investigate in 
some detail the nature of the velocity distribution in the 
solar neighborhood not only for early-type stars but also 
for the old stellar population of the Galactic disk. In 
a preceding paper (Dehnen & Binney 1998b, hereafter 
paper I), we have used a kinematically unbiased subsam- 
ple of the HIPPARCOS Catalogue to infer, as a function 
of stellar color, the first two moments, mean and disper- 
sion, of f(v) for nearby stars. This paper goes beyond 
these moments to infer the velocity distribution itself 
from the HIPPARCOS data. 

If the HIPPARCOS mission were complemented by 
a program to measure the radial velocities of the same 
stars, we could obtain to high accuracy their space veloc- 
ities v, and hence directly measure the distribution func- 
tion for these stars. Unfortunately, however, the radial 
velocities for many stars in the HIPPARCOS Catalogue 
are not (yet) publicly known. From the existing litera- 
ture one can extract radial velocities only for a subset of 
HIPPARCOS stars that is heavily kinematically biased 
in the sense that it predominantly contains high-proper- 
motion stars (Binney et al. 1997). Hence, in order to 
make full use of the HIPPARCOS Catalogue, or of simi- 
lar future catalogs, for studies of stellar kinematics, one 
cannot necessarily rely on radial velocities, but must be 
able to work with the positions and tangential velocities 
alone. For each position on the sky, the tangential ve- 
locity is a certain combination of the components of the 
space velocity. It is immediately clear that we can only 
proceed if we link together the tangential velocities mea- 
sured for stars in different directions from the Sun. In 
order for such a method to be valid, one must make the 
basic assumption that f(v) is independent of the posi- 
tion on the sky, a condition that should be satisfied for 
sufficiently nearby stars. 

In Section 2, 1 show that, under this basic assumption, 
knowing the distributions of tangential velocities for a re- 
gion of the celestial sphere that cuts all great circles gives 
complete knowledge of the full distribution of space ve- 
locities. The problem is formally equivalent to that of 
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classical tomography. However, there is a fundamental 
difference, namely that we are unable to measure the dis- 
tributions of tangential velocities to arbitrary accuracy, 
rather we only know the tangential velocities for a few 
thousand stars distributed over the sky. Consequently, 
we cannot use the tools developed for tomography. In- 
stead, in Section 3, a maximum-likelihood technique is 
presented and tested that enables one to estimate f(v) 
for a set of stars from their positions, parallaxes, and 
proper motions. 

The reader not interested in these techniques and nu- 
merical details but mainly in the resulting f(v) may skip 
Sections 2 and 3 and directly go to Section 4, where the 
algorithm is applied to the kinematically unbiased sub- 
sample of HIPPARCOS stars that we have already used 
in paper I. The results and their implications are dis- 
cussed in Sections 5 and 6, while Section 7 sums up. 

2. THE PROJECTION OF SPACE VELOCI- 
TIES 

2.1. The Projection Equations 

Let v be the velocity of a star w.r.t. the Sun in a 
Cartesian coordinate system with v x , v y , and v z denot- 
ing the motions towards the Galactic center, in the direc- 
tion of Galactic rotation, and towards the north Galactic 
pole, respectively. While this coordinate system is a good 
choice in dynamical studies, the natural coordinates for 
observations are v r = r, Vg. = rpicosb, and v b = r/jj,. 
Here i, b, and r denote, respectively, Galactic longitude, 
latitude, and distance from the Sun. pi and p\, are the 
proper motions corrected for the effects of Galactic rota- 
tion 

w = M (° bs) - A cos(2^) - B 

Pb = Mt>° bb ' + A sin(2£) cos b sin 6, 

with Oort's constants A and B; I use Feast & Whitelock's 
(1997) values (in kms" 1 kpc -1 ) A = 14.82 and B = 
-12.37 obtained from HIPPARCOS Cepheids, but the 
results are insensitive to the precise values. The two 
coordinate systems are related by a rotation 



(v r ,V£, v b ) T — R v, v — R T (v r ,vt, v b ) J 



with rotation matrix 



R 



cos b cos £ 
— s'm£ 
— sin b cos i 



cos b sin £ sin b 
cos£ 
- sin b sin £ cos b 



(2) 



(3) 



For our purposes it is useful to consider the two-dimen- 
sional velocity vector q = (vg,Vb), which can be evalu- 
ated from HIPPARCOS's measurements. It is related to 
v by the projection equation 

q = S • v (4) 



where the 2x3 matrix S is given by the second and third 
row of R. In the (v Xl v y ,v z ) frame, the corresponding 
tangential velocity p = S T ■ q is related to v by the 
projection 



p = v — (r ■ v)r = A ■ v, 



(5) 



where f, given by the first row of R, is the unit vector 
in the direction (£, b) . Equation (|^) , which is identical 
to Equation (3) of paper I, defines the projection matrix 
A = I — r r (paper I, equation 4) and is equivalent 
to (Q). Clearly, matrix A is singular, so (||) cannot be 
inverted; one needs the radial velocity to obtain v. How- 
ever, taking the average of @ over a region of the sky, 
which can in principle be as small as a line segment, 
results in the non-singular equation 



<P> = (A) • (v) , 



(6) 



which can be solved for the average space velocity (v), 
as we have done in paper I. Similarly, one can take the 
average of the nth outer product of (|B|) with itself 



(pi 



) p) = (A i 



(7) 



and obtain the moments of order n for the space veloci- 
ties; in paper I we have done this for the velocity disper- 
sion tensor <r 2 = (v ® v) — (v) (g) (v). 

The projection onto the celestial sphere can also be 
formulated in terms of the velocity distribution function 
f(v). To this end let p(q\f) be the probability distribu- 
tion of tangential velocities in direction f, then 



P{q\r) = J dv r f (v = S T - q + v r 



2.2. The Fourier Slice Theorem 



(8) 



At this point is very fruitful to invoke the Fourier slice 
theorem. This follows directly by Fourier transforming 
@ and states that the (two-dimensional) Fourier trans- 
form g(n\r) of p{q\r) w.r.t. q is related to the Fourier 
transform T(k) of f(v) by 



e(«|f) 



2iTF(k = S 1 ■ k), 



(9) 



i.e. g is given by T in the slice normal to r. Thus, one 
will have full knowledge of T{k), and hence of f(v), if 
and only if one knows p(q\r) for a set of f such that the 
corresponding set of slices covers the full k space. This 
immediately leads to the following statement: 

"The underlying f(v) is uniquely determined by its pro- 
jection p(q\r) if and only if the latter is known for a 
region of the celestial sphere that intersects with every 
great circle." 
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The smallest such region is any half great circle. Hence, 
knowing p(q\r) for more than that yields redundant in- 
formation that, in principle, can be used to check our 
basic assumption that the inferred f(v) does not depend 
on direction from the Sun. 

The attentive reader might feel an apparent contra- 
diction between the statement above and the result from 



Section 2.1 that the moments of f(v) can be derived al- 
ready if one has precise knowledge of p(q\r) on a small 
region of the sky. The resolution is as follows. Given one 
knows p(q\r) on a region that does not satisfy the above 
criterion, then there will be a region in k space where one 
has no knowledge of J-(k). Hence, any distribution f'(v) 
whose Fourier transform T'{k) is non-zero only in this 
zone of ignorance is undetectable in the data (it projects 
to zero) . Such a function cannot possibly be analytic (in 
the mathematical sense), i.e. the Taylor series for T'(k) 
does not converge globally. Since 



d?vf{v)v l x v™v n z 



d 



T 



d{ik v jd(ik Vy ) m d{ik Vz y 



k=o 



we can only infer T{k) from the moments of f(v) if we 
assume that T is analytic. The Fourier slice theorem 
makes no assumptions about the nature of /, and, as a 
consequence, stronger observational constraints are re- 
quired. 

2.3. The Marginal Distributions 

In many cases one is only interested in the distribu- 
tion of velocities in the plane fii(v x ,v y ) = J dv z f(v). A 
projection equation for /|| can be obtained by integrating 
(0) over all v&: 



p(v e \£) = / dv\\ f\\ 



v x = cos£ vu — sirji vi 
v y = sin£u|| + cosive 



(10) 



where p(ve\() is the distribution of vi for stars in direction 
£. That is, we could derive f\\ from the vt alone, without 
need of Vb, and still use data from stars all over the sky. 
However, as experiments showed, inverting (0) and then 
projecting gives better results than inverting ([l^). This 
is not surprising, since the star's latitudinal velocities 
contain valuable information about (v x ,v y ). 

For the recovery of the marginal distribution of verti- 
cal velocities, f±(v z ) = ff dv x dv y f(v), directly from the 
data we can only use stars in the Galactic plane. This 
follows from the Fourier slice theorem: the only slices 
that give information on the k z axis are those that con- 
tain it, i.e. p(vb\b—Q) = f±(vb). 

2.4. Relation to Classical Tomography 

There is a close connection between our problem and 
tomography as in medical diagnostic. In that case, the 



unknown distribution is defined in configuration rather 
than in velocity space, but the projection equation is 
identical to (|J) or ([l0|). There is, however, a funda- 
mental difference to our problem: astronomy relies on 
observations - most other sciences rely on experiments. 
In tomography one is able to measure the projected dis- 
tribution to high accuracy for many directions densely 
covering half a great circle. A viable way to recover the 
underlying distribution from the density of J> 10 11 gath- 
ered (or absorbed) photons is then the direct inversion 
via the Fourier slice theorem, also known as inverse radon 
transform. By contrast, we only know, typically, ~ 10 4 
points that nature has randomly chosen for us from the 
underlying f(v). The great gulf between 10 11 and 10 4 
obliges us to analyze our data differently. 

3. THE VELOCITY DISTRIBUTION AS 
MAXIMUM-PEN ALIZED-LIKELIHOOD- 
ESTIMATE 

This section deals with the algorithmic and numerical 
aspects of estimating f(v) from the tangential velocities 
and directions of N stars. In order to avoid confusion, I 
will use, in this section only, the symbol / for any model 
or estimate of the true and unknown distribution /o that 
underlies the data. 

3.1. Formulation 

As the discussion in the last subsection showed, it is 
rather hopeless to infer fo(v) directly via the Fourier 
slice theorem. Rather, following Bayesian statistics, the 
route to estimate fo(v) is to maximize the log- likelihood 
of a modelQ f(v) 



N 

C(f)=N- 1 £ lnP(q k \f k ,f), 



(11) 



k=l 



1 In the literature one can also find convergent point methods used 
for similar purposes. These methods rely on geometrical arguments 
and stem from the times when more rigorous treatment (like max- 
imizing the log-likelihood) was impractical on technical grounds. 
Nonetheless, Chereul, Creze & Bienayme (1997) have used such a 
technique for the estimation of fo(v) for A stars from HIPPAR- 
COS data. These authors define the convergent points for a pair 
of stars as the velocities on their lines v = p + v r r (where the 
unknown v r is treated like a variable) that minimize the distance 
between them. If this distance is smaller than some threshold, 
the two velocities are remembered. After considering all possible 
pairs of stars, this results in a list of velocities, that can be directly 
translated into a histogram to estimate fo(v). Testing this method 
for anisotropic fo(v), I found that the outer contours are always 
too round and the estimated velocity dispersions significantly too 
high. This failure can be attributed to accidental convergent points 
at large |u|. In the limit of infinite many stars, this form of the 
convergent-point method does not converge to /o — in contrast to 
a maximum-likelihood technique. 
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where 



P(Qk\rkJ) = J dv r f(v =p k + v r r k ) (12) 
= J d 3 v f(v) 6(q k - S fc • v) (13) 



is the probability to observe the tangential velocity p k = 
Sfc • q k for a star seen in direction r k and with velocity 
drawn from /(i>)0- The functional C(f), however, has no 
maximum, it is unbounded, because one can arrange N 
(5-functions, one on each line p k + v r r k in v space, to get 
C = oo. In order to obtain a well defined procedure, one 
needs to regularize C(f). This is conveniently done by 
adding a penalty functional, which in Bayesian statistic 
might be interpreted as the logarithm of the prior. That 



is, one maximizes 



Q a (/) = £(/)- ±aS{f), 



(14) 



where the penalty functional S(f) measures the rough- 
ness of f(v), whereas a > is a Lagrange multiplier, 
usually referred to as smoothing parameter. The func- 
tion that maximizes Q a (f) subject to the constraints 



< f(v) and 
l=M(f)= [ d 3 vf(v) 



(15) 
(16) 



is the maximum-penalized-likelihood-estimate (MPLE) 
and will be denoted f a (v). The optimal choice of a is 
a problem on its own and the subject of the next sub- 
section. A common choice for the penalty functional is 
Jd 3 u(V 2 ln/) 2 . I will use a slightly different form: 



S(J) = J d 3 v(V 2 \nf) 2 



with 



Gx dv x ' ° v dv v 



dv. 



(17) 



(18) 



where crj is a measure for the width of f(v) in the ith 
dimension, for instance, the velocity dispersion estimated 
via the methods of paper I. With this choice of <S(/), the 
smoothing parameter a is independent of the width of 
So- 



2 The measurement uncertainties of the HIPPARCOS data can be 
incorporated by replacing the 5-function in ( |l3| ) with a Gaussian in 
the observables. However, as became clear in paper I, the dominant 
source of errors is sampling noise due to the small number of stars 
in the sample, and accounting for the uncertainties would hardly 
improve the results. 



3.2. The Optimal Smoothing 

The parameter a determines the amount of smooth- 
ing. Clearly, there exists an optimum value, < a op t < 
oo, since neither a = (no smoothing leading to the 
^-function catastrophe) nor a = oo (ignoring the data) 
gives an useful result. One usually finds a op t by requir- 
ing that it minimizes the difference, -D(/a,/o)j between 
f a and /o- There are various ways to measure this differ- 
ence, most commonly used are (cf. Silverman 1986) the 
integrated square error (ISE) 



D(f a ,h) 



d 3 v (f a - Jo) 2 



(19) 



or the Kullback-Leiber information-distance (KLD) 
D(f a Jo) = - [d 3 v f Hh/U). (20) 



Of course, since /o is unknown, the ISE (or KLD) cannot 
be evaluated. We can, however, estimate fo and simu- 
late data and analysis, in order to estimate D(f a ,fo). 
Since f a not only depends on /o and a, but also on the 
particular random realization of the N data, D(f ai fo) 
is a random variable. Minimizing a random variable 
make no sense, and one usually considers the mean 
over the data realizations, giving the mean integrated 
square error (MISE) and the mean Kullback-Leiber dis- 
tance (MKLD), respectively. These can be estimated and 
hence minimized, even though fo is unknown, a process 
called cross-validation. For the present problem I use the 
following procedure. Starting with some initial guess, ao, 
for a op t one approximates fo as f ao and measures the 
MISE (or MKLD) that results for a given a. (This is 
done by drawing M samples of N pseudo-data from / , 
evaluating for each of them f a , measuring D(f a ,f aQ ), 
and taking the mean over the M samples.) The value 
for a that minimizes the MISE (or MKLD) is then used 
as the next iterate for a opt . If one starts off with a good 
guess ('by eye'), this procedure converges to reasonable 
accuracy within a few iterations. 

Since the data are too poor to infer the full three- 
dimensional structure of f(v) (N of ~ 10 4 corresponds 
to ~ 30 stars per dimension), I will concentrate on the 
projections onto the three principal coordinate planes. In 
the v x v y plane, f{v) is most extended (as judged from 
the dispersions obtained in paper I), and consequently I 
estimated the MISE and MKLD only for the projection 
onto this plane, i.e. instead of the ISE (eq. p~|) I compute 



dv x dv v 



dv z 



f-fo 



(21) 



and analogously for the KLD. 
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3.3. Numerics 

3.3.1. The MPLE as Unconstrained Extremum 

Our problem is, to find the maximum of Q a (f) sub- 
ject to the constraints ( |f5| ) and (|l6|). Constraining the 
possible solutions in such a way is awkward, in partic- 
ular for numerical extremization. Fortunately, the non- 
negativity and normalization conditions can be met with- 
out imposing any constraint. According to Silverman 
(1982) the condition Af(f) — 1 can be satisfied by max- 
imizing instead of Q a {f) the functional 



QM) = QM)-M(f). 



(22) 



Silverman has shown that the MPLE, f a , uncondition- 
ally maximizes QM)- To see this, let /* = f /M(f) 
be the normalized counterpart of f(v). Since S only in- 
volves derivatives of In/, S(f) — S(f*) and elementary 
manipulations yield 



QM*) = QM) +M(J) - ^Af(f) - 1. 



(23) 



Since t — lnt — 1 > for all t > with equality only if 
4=1, QM*) > QM) with equality only iiAf(f) = 1. 
Thus, for the maximum of Q a , AT(f) = 1; but subject 
to Af(f) = 1, Q a {f) and Q a {f) — 1 are identical and the 
proof is complete.^] 

The non-negativity condition (^) can be trivially met 
by defining 

f(v) = exp [cf>(v)] (25) 
and considering Q a a functional of the unbounded <fi(v). 

3.3.2. Numerical Representation 

The easiest way to represent (j)(v) numerically is by 
pixelization on a Cartesian grid with L = L x x L y x L z 
cells of size h x x h y x h z : 



(v) = Y,<j> l Wi(v), k = 0,...,Li-l, 



(26) 



with the window functions 



Wi(v) = i ( hxh y hz ) 1 if Vi\ v i- l ihi-yi\ < ^ 
|0 otherwise, 



3 Note that, even though Silverman's original proof relies on the 
particular form of the functional to be maximized, one can for any 
functional A(f) (/ > 0) eliminate the normalization constraint 
using his method: the (unconditional) maximum of 



Mi) = A(f/Af[f]) + In A/" (/) - Af(f) 



(24) 



where y is the center of cell 1 = and determines the 
position of the cells in v space. Inserting ( p6| ) into (|lT 
gives 



P(q k \r k J)=Y,^ l K(k\l) 



where 



K(k\l) = J dv r Wi (v = p k + v r r k ) , 



(28) 



(29) 



which simply is (hxhyhM 1 times the length in v r of the 
segment of the line v = p k + v r r k that lies in cell I. We 
can estimate 



V 2 hxf(v t ) ~^(f) n S njJ , 

n 

where 



— nl 



i—x,y,z 



(30) 



(31) 



with Bi denoting the unit vector in the zth direction. 
Thus, the numerical approximation to the functional 
QM) is 



r i 2 



(32) 



with (</>;) = (f), a i-dimensional vector. 

3.3.3. Maximization of Q a 

At the maximum of Q a , its gradient vanishes giving 
the relation (with equation E2) 



hi 



dQ a 



(33) 



A common way to obtain the solution of such a fix-point 
equation is to start with some initial guess, insert it in the 
right-hand side, to obtain an improved estimate on the 
left-hand side. If the function on the right-hand side of 
( |33| ) is well defined (i.e. the derivative is always positive) 
and contracting, then it follows from Banach's fix-point 
theorem that an iteration of this step will eventually con- 
verge to the MPLE. If Q a = £, these conditions are sat- 
isfied and the resulting algorithm is known as Richardson 
(1972) - Lucy (1974) algorithm!]. However, even when it 



maximizes 



A(f) subject to M(f) = 1. 



4 The Richardson-Lucy algorithm can be generalized to maximize a 
more general functional, A(f), subject to J\T(f) = 1. Let A(fi) the 
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works, the convergence will, in general, be no faster than 
linear. 

Therefore, the maximization is more efficiently done 
by the conjugate-gradient algorithm (cf. Press et al. 
f992). This technique too obtains the vector <fii of the 
MPLE, hereafter denoted by 4> t , iteratively (with O(L) 
operations per iteration). This time, however, the con- 
vergence will be quadratic, and the number of iterations 
theoretically required is about L, so in total 0(L 2 ) op- 
erations are required. For L 10 5-6 this implies that 
each solution needs a considerable amount of computer 
time. 

Fortunately, experiments showed that already after 
much less iterations <pi is very close to 4>i ■ For a run with 
L = 430 080 and N — 1294 pseudo-data drawn from a 
model, the sum of three Gaussians, Figure [l] shows var- 
ious quantities as function of the number of iterations 
completed. Already after ~ 10 4 w 0.02L iterations, the 
modulus of the gradient dQ a /d<f> has decreased by four 
orders of magnitude (the iterations were actually stopped 
when it had decreased by 10~ 7 ); the same holds for both 
the supremum of the modulus and the RMS-value of the 
increment Aipi at each iteration. For a rigorous estima- 
tion of the error of the current (f>i as compared to 4> l , one 
can extrapolate supj{| Acf>i |} into the future in order to 
estimate an upper limit for supj{|(/>z — <j>i\}. 

Panels (b) and (c) of Figure |l| show the run of the 
ISE and KLD between the current estimate for f(v) and 
the model underlying the pseudo-data. It is evident that 
after ~ 2000 w 0.005L iterations further maximization 
of Q a (f) does not change the quality of the current esti- 
mate, even though the convergence is still sub-quadratic 
(as judged from the run of supj{|A(fo|}). Clearly, it 
makes no sense to iterate far beyond this point. In prac- 
tice, I iterate until \dQ a /d(f)\ falls below some threshold, 
e.g. 10 -6 , but perform at least a minimum number of 
iterations. 

The number of operations can be considerably re- 
duced by the multi-grid approach. A solution obtained 
on a coarser grid is cheaper to obtain but contains al- 
ready a wealth of information. This can be exploited 
by starting on a really coarse grid, maximizing Q a on 



discretized form of A(f), then the increment A/; for this algorithm 
can be obtained by requiring dA /d In f, = p (see equation yA for 
the definition of A), which gives (Lucy 1994) 




(34) 



However, for the algorithm to work one must have dA/dfi > 
everywhere, which places a strong restriction on the possible ap- 
plications; for instance, for A = Q a these conditions are generally 
not satisfied. 



this grid, transforming to a finer grid, and so on. I will 
use a sequence of four or five grids, each created from 
its predecessor by dividing the cells into eight daughter 
cells; cj> is linearly interpolated to transform to the finer 
grid. Typically, on each of the grids 200 iterations are 
performed, except for the finest one, where the number 
might be higher. 

4. APPLICATION TO HIPPARCOS DATA 
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Fig. 1. — Conjugate-gradient maximization of Q a for a run with 
L = 96 X 80 X 56 and TV = 1294 data drawn at random from a 
model for f(v). The panels show as a function of the number of 
iterations completed: (a) gradient of Qa, supremum and RMS value 
of the increment; (b) integrated square error (|uj); (c) Kullback- 
Leiber information-distance (pfj|); and (d) Q a itself. 
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4.1. The Subsamples 

Even though the HIPPARCOS Catalogue contains 
many of the nearby stars, its completeness varies with 
direction and stellar type and some care must be taken, 
in order to avoid kinematic biases, which are present in 
the full Catalogue and would render this study useless. 
In paper I, we have extracted from the HIPPARCOS Cat- 
alogue a kinematically unbiased sample of 11865 sin- 
gle main-sequence stars with parallaxes accurate to 10 
percent or better, hereafter Sample 1. The kinematics 
inferred from this sample in paper I depend on color. 
For B — V 0.61 mag, the dispersion velocities increase 
for ever redder color bins, because, for main-sequence 
stars, the mean color correlates with the mean age, and 
scattering processes increase the random motions with 
time. Above B — V ~ 0.61 mag the dispersion is con- 
stant. This transition is called Parenago's discontinuity 
and arises because at B — V > 0.61 mag subsamples of 
main-sequence stars have the same mean age. 

To take this dependence of kinematics on color into ac- 
count, I subdivide Sample 1 into four color bins, labeled 
'Bl' to 'B4'. The first, Bl, contains stars bluer than 
B — V — Omag. These stars do not follow Stromberg's 
asymmetric-drift relation (the linear dependence of the 
mean v y on the velocity dispersion squared) defined by 
the other samples (Figure 4 of paper I) . The fourth color 
bin, B4, consists of all stars of Sample 1 red-ward of 
B — V — 0.6 mag and should be dominated by old-disk 
stars. The remaining two bins have intermediate col- 
ors. In addition to these bins of main-sequence stars, I 
consider a fifth set, labeled 'GI', of stars, mainly giants, 
which are in the kinematically unbiased sample derived 
in paper I but excluded from Sample 1 because they are 
off the main-sequence. Finally, the union of all these sub- 
sets, labeled 'AL', is analyzed as a whole. So, in total 
there are six kinematically unbiased sets of stars, five of 
which do not overlap; more details are given in Table |. 

The full set, AL, is magnitude limited and one should 
bear in mind that it does not give a fair representation of 
the color and velocity distribution of typical stars in the 
solar neighborhood, but is heavily biased towards lumi- 
nous stars. Subsets Bl to B4, however, being restricted 
to main-sequence stars of a narrow color-range or be- 
yond Parenago's discontinuity, are much less biased in 
this sense. Finally, for the interpretation of the results 
it is useful to note that the upper limit in B — V for the 
samples Bl to B3 of main-sequence stars places an up- 
per limit on the stellar age; the last column of Table [l] 
contains the values obtained from the stellar evolution- 
ary models by Bressan et al. (1993) for a metallicity of 
Z = 0.02. 

In Figure ||, the positions on the sky of the stars in 



the five distinct subsamples Bl to B4 and GI are dis- 
played. Gould's Belt can be clearly identified in Bl, but 
the remaining subsamples do not show obvious cluster- 
ing (even the Hyades at i ~ —175°, b w —24° are hardly 
detectable). However, the distributions are not uniform, 
there are more stars near the poles of the ecliptic. The 
reason is that at these poles the absolute accuracies of 
the parallaxes measured by HIPPARCOS are highest, 
so more stars survive our selection on relative distance 
errors. Another enhancement is clearly visible in the 
subsample B4 around the southern celestial pole (near 
the southern pole of the ecliptic). This is because, as 
outlined in paper I, Sample 1 contains stars from HIP- 
PARCOS proposal 018, which is restricted to declina- 
tions south of —28°. 

4.2. The Distribution of Space Velocities 

For each of the six sets, f(v) was estimated as de- 
scribed in Section || with a finest grid of 96 3 cells of 
size (velocities are always given in kms -1 throughout 
this section) 3 x 2.5 x 1.8 covering the box [—154, 134] x 
[-160, 80] x [-93.4, 79.4]. Stars whose K(k\l) contributes 
to less than 96 cells are likely to originate from veloci- 
ties outside this box and have been excluded from the 
analysis. 

Because I have split the full sample into distinct sub- 
samples, optimizing the smoothing parameter (via the 
cross-validation technique of Section [T^) for each sub- 
sample independently would erase information that be- 
comes significant only when more than one subsample 
is considered simultaneously: features that are consis- 
tent between two or more distinct subsamples are sig- 
nificant even if optimal smoothing would suppress these 
features in each of the subsamples. Therefore, I em- 
ployed the cross-validation technique only to optimize 
the smoothing for the full sample (AL). The result- 
ing values a = 3.42 x 10~ 10 and a = (30.6,21.3,14.6) 
(obtained as the velocity dispersion computed via the 
method of paper I) minimize the MISE and MKLD at 
(2.13 ± 0.09) x 10" 6 and 0.0071 ± 0.0003, respectively. 



Table 1: The subsamples analyzed 



name 


(B-VU 


in, max 


iVtot 




7~max 


Bl 




0.0 


524 


524 


4 x 10 s 


B2 


0.0 


0.4 


3201 


3199 


2 x 10 9 


B3 


0.4 


0.6 


4596 


4582 


8 x 10 9 


B4 


0.6 




3544 


3527 




GI 






2504 


2491 




AL 






14369 


14323 





Color limits (in mag), total number iVtot of stars, number -/V usct j 
of stars used, and maximum stellar age (in years), for the color 
bins Bl to B4 of main-sequence stars, the giant sample GI and the 
union of these, AL. (see text). 
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Fig. 2. — The data: distribution in galactic coordinates of the five distinct sets Bl to B4 and GI. The bottom right frame shows lines 
of constant I (from 180° at the left to —180° at the right, spacing of 60°) and b (from —90° at the bottom to 90° at the top, spacing of 
30°), the triangles mark the poles of the ecliptic. 



For the subsets, except Bl, I fixed a and <x at these val- 
ues (which means that a is up to about four times smaller 
than the optimal value for each subset). Subset Bl has 
much fewer stars in it than the other subsets and tak- 
ing these values for the smoothing parameters resulted 
in a greatly under-smoothed estimate; I therefore used 
a = 1.78 x 1(T 9 for subset Bl. 

As a consistency check, I evaluated the first and 
second velocity moments from the inferred f(v); they 
agreed well with the moments estimated via the method 
of paper I. To test whether contamination with known 
clusters or associations is a serious problem, I applied 
the algorithm to subsamples with the region on the sky 
around the Hyades cluster and the region dominated by 
Gould's Belt (for Bl) being excluded. The results were 
similar to that obtained without excluding these stars. 

For the six sets of Table |], Figures |], and || show 
the projections of the estimated f(v) onto the v x v y , v x v z , 
and v y v z plane, respectively. 

The most conspicuous characteristic of the inferred 
velocity distributions are several strong maxima and 
many minor wiggles most apparent in the planar mo- 
tions (Fig. H). These features are less clear but still sig- 
nificant in the sample AL, for which optimal smoothing 
has been used. Almost all of these major and minor 
maxima and even some of the wiggles can be identified 
with one of Eggen's (1965, 1995, 1996) moving groups. 
Table || lists the positions of 14 such features in Fig- 
ures H to H together with the name of an associated 
moving group, where one could be found in the litera- 
ture. (Some of the associations might well be wrong, for 
instance, Eggen gives an age of 10 Gyr for the Arcturus 
group, much more than any star possibly in Bl.) The 



most prominent of these features (i.e. the first 4-5 in that 
table) have also been found by other groups from analy- 
sis of HIPPARCOS data, e.g. Figueras et al. (1997) using 
HIPPARCOS data in conjunction with radial velocities 
for young B5-F5 stars, and Chereul et al. (1997) using 
a convergent-point method (see footnote 1) for 3000 A 
stars^. The distribution of these maxima is skewed in the 
v x v y plane (but not in the other two principal planes), 
which in turn is responsible for the tilt in the velocity 



5 The latter authors even claim substructure on a scale of a few 
kms -1 , which I must call into question in the light of tests I have 
made: on small scales noise amplification inevitable creates spuri- 
ous structures - a phenomenon characteristic for inverse problems. 



Table 2: Features in the inferred velocity distributions 



No. 




Vy 


v z 


Bl B2 B3 B4 GI 


moving group 


1 


-12 


-22 


-7 


• • • 


• o 


Pleiades 


2 


-40 


-20 







• • 


Hyades 


3 


9 


3 




• • • 


• • 


Sirius & UMa 


4 


-10 


-5 


-8 


• • • 


• 


Coma Berenices 


5 


-25 


-10 


-15 




• o 


NGC1901 


6 


20 


-20 






• • 




7 


15 


-60 






• o 


HR1614 


8 


-40 


-50 




• • 


• • 




9 


-25 


-50 




o 


• • 




10 





-110 




• o 


o 


Arcturus 


11 


-70 


-10 




o 


o 




12 


-70 


-50 




o 


o • 




13 


50 







o 


• o 




14 


50 


-25 




o 


• o 





Velocities are approximate (v x is —U in Eggen's papers); if v z is 
missing it cannot be determined. • and o denote, respectively, 
clear or vague visibility in the corresponding subsample. The list 
of associated moving groups is presumably incomplete. 
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Fig. 3. — The distributions in Vx (towards the Galactic center) and v y (in direction of Galactic rotation): projection of f(v) obtained 
as MPLE for the sets listed in Table hi. Gray scales are linear and the contours contain, from inside outwards, 2, 6, 12, 21, 33, 50, 68, 80, 
90, 95, 99, and 99.9 percent of all stars, i.e. half the stars are within the innermost dark contour. The origin is at the solar velocity, while 
the velocity derived for the LSR in paper I is indicated by a triangle. Note that the smoothing is optimal for the full sample (AL) only, 
while the results for the subsets are under-smoothed. However, since the subsets are distinct, any feature common to more than one of 
them is likely to be real. 



ellipsoid (vertex deviation) derived in paper I. considers ever redder and hence on average older main- 

As one moves through the sequence Bl to B4, i.e. sequence stars, the following can be noticed. 




Fig. 4. — The distributions in v x (towards the Galactic center) and v z (towards the NGP). See Fig. ^for the contour levels, etc. 



(1) The relative number of stars in the maxima dimin- 
ishes. While roughly half of the stars in subsamples 
Bl and B2 are associated with one of the features, 
in B4 over ^3/4 cent are in a smooth background 
distribution. 

(2) The number of features increases. This means that 
some of the moving groups are old and enter the 
red subsamples only, whereas young clusters have 
stars of all colors. 

(3) Moving groups that are present only in the red sub- 
samples are predominantly at large negative az- 
imuthal velocities. Thus there is a correlation in 
the sense that the older a moving group is, the 
smaller is its (local) rotational velocity around the 



Galaxy. Similarly, most features at small or posi- 
tive v y decrease in amplitude as one moves to red- 
der samples. 

(4) The width of the distribution of moving groups 
clearly increases in (v x ,v v ) but much less so in v z . 
Together with (3) this means that the distribution 
of moving groups obeys an asymmetric drift rela- 
tion. 

(5) The width of individual maxima increases (at least 
for the four most prominent ones). 

(6) From B2 to B4, the extent of the outermost two 
contours (containing 99 and 99.9 per cent of the 
stars) increases by a factor of about two in v x 
and v y and about three in v z , while the center is 
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-40 - 
-60 - 




B2 





-50 
v y [km s""] 



50 
v y [km a" 1 ] 



Fig. 5. — The distributions in v y (in direction of Galactic rotation) and v z (towards the NGP). See Fig. |s| for the contour levels, etc. 



shifted to more negative v y , 
metric drift. 



reflecting the asym- 



(7) From B2 to B4, the axis ratio of the outer con- 
tours changes from about 1:0.6:0.35 to 1:0.6:0.5. 
A similar change, however, is also visible in each 
individual subsample when one moves from inside 
(small v) outwards. 

With respect to all these points, subsample GI of non- 
main-sequence stars most closely resembles sample B4 of 
main-sequence stars redder than B — V = 0.6 mag. 

It should be noted that, in contrast to the projections 
onto the v x v y plane, most features apparent in v z seem 
not be real (they do not occur in more than one distinct 
subsample). In fact, they all disappear when optimal 



smoothing is applied (not shown) as is the case for sample 
AL, with the remarkable exception of the double peaked 
f(v z ) at {v x ,v y ) « (-30,-20) in B4. 

5. MOVING GROUPS AND 
STELLAR STREAMS 

About half of the stars in the blue color bins Bl and 
B2 (with B — V < 0.4 mag) are associated with moving 
groups, but the same holds for a considerable fraction 
(up to ~ 25 per cent) of main-sequence stars in the red 
color bins B3 and B4, even red-ward of Parenago's dis- 
continuity at B — V s» 0.61 mag (paper I), where no disk 
star has yet left the main sequence. This means that 
many of the moving groups, in particular the less promi- 
nent ones, must be made of stars older than 2-8 Gyr, the 
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ages of the oldest stars possibly in samples B2 and B3, 
respectively. 

5.1. The Shape of Moving Groups 
in Velocity Space 

According to the standard picture of these phenom- 
ena, stars form in clusters each of which is initially con- 
fined to a small volume in phase-space. Over time these 
dissolve, such that the old stellar population obeys a 
smooth distribution both in configuration and velocity 
space. While the mechanisms driving the evolution and 
dissolution of the initially bound cluster have been stud- 



ied in some detail (Tcrlevich 1987), little is known about 
the subsequent evolution. Once the cluster becomes un- 
bound, its stars will have slightly different orbits and 
hence different frequencies, such that they will phase- 
mix. A cluster with initial velocity dispersion (To will be 
dispersed over all azimuths in a time of lnRj gq (~ 5 Gyr 
for R = Rq and er ~ lOkms -1 ). Since, at any fixed az- 
imuth one expects only stars with identical azimuthal fre- 
quency, i.e. identical angular momentum, such a stellar 
stream is expected to manifest itself as a clump in f(v) 
that is narrow in v y but has size ;> oo in (v x , v z ) ( Woolley 
196l| )p|. Although measurement uncertainties and noise 
tend to sphericalize these clumps, it is somewhat sur- 
prising, that only a few of the features in Figure || are 
elongated in this sense. However, one also expects that 
scattering with perturbers, like giant molecular clouds 
or spiral arms, will randomly change the stellar orbits 
and hence spread the stream deleting the correlations in 
velocities. On the other hand, as has long been known, 
the moving groups, except the very young ones, are al- 
most indistinguishable in v z . Presumably, this can be 
attributed to phase-mixing which is more efficient than 
for the horizontal motions, because, for any realistic disk- 
profile, the vertical frequency is a strong function of ver- 
tical energy, and the dynamical time is shorter by about 



6 Of course, in reality the stars of my sample are not exactly at the 
solar position but have a RMS deviation of d ss 40 pc in each direc- 
tion. This leads to a blurring of the clumps, since stars on iden- 
tical orbits will have different velocities throughout the sampling 
volume. For an axisymmetric galaxy with flat rotation curve, the 
size of this blurring can be estimated from conservation of energy, 
angular momentum, and vertical energy to be 



WO 



i vo d/Ro 

y/*($ ~ <t>(0) 



(35) 
(36) 

(37) 



where vq is the circular speed and &(z) vertical Galactic poten- 
tial. With Ro = 8kpc, vo = 200kms — 1 , and using model 2 from 
Dehnen & Binney (1998a) for the potential of the Milky Way in 
( p7|) this gives <r m (14, 1, 3) kms" 1 . Adding this quadratically to 
the effect of phase-mixing amplifies the expected elongation of the 
clumps in u-space. 



a factor of two. 

5.2. Moving Groups on Eccentric Orbits 

Thus, it appears natural that there is much more 
structure in (v x ,v y ) than there is in v Zl however, the 
character of this structure is very interesting. In par- 
ticular, the fact that the distribution of moving groups 
obeys an asymmetric drift relation (points 3 & 4 in Sec- 
tion [h^), similar to the smooth background: the older 
groups are more wide spread in v and lag w.r.t. the LSR, 
i.e. they are on non-circular orbits. For stars in the 
smooth background these relations arise because scatter- 
ing processes increase the velocity dispersion with age, 
and because much more stars visit us from inside the 
solar circle than from outside. The latter certainly also 
holds for the moving groups and causes the lag. How- 
ever, scattering processes would destroy and disperse an 
existing moving group rather than shifting it to another 
orbit. The question therefore arises, how did the old 
moving groups get on their eccentric orbits? 

Even tough one could imaging a scenario where the 
moving groups have been born onto these orbits - for in- 
stance, by star formation in the molecular gas ring at 
~ 4kpc, where the Galactic bar significantly distorts 
the axisymmetry - this seems unlikely in view of the 
facts that phase-mixing should have spread these mov- 
ing groups in v x and that the apparently young moving 
groups are indeed on near-circular orbits. Alternatively, 
the moving groups could have been transformed to more 
eccentric orbits. This would naturally account for the 
age-dependence of the eccentricity. Since a moving group 
is a rather fragile object, any non-axisymmetric force 
field responsible for such a process must be smooth, both 
in space (long wavelength) and time (low frequency). 
While a triaxial halo or the Magellanic clouds might play 
a role in such an process, the most obvious candidate is 
the Galactic bar. Actually, almost all the apparently 
older moving groups have negative v y , and thus low L z , 
they are near the apocenters of their orbits, which probe 
a large range of radii and come near to the bar. 

A plausible mechanism that is able to shift moving 
groups to eccentric orbits has been described by Sridhar 
& Touma (1996). Applied to our problem it works as fol- 
lows. If the otherwise axisymmetric Galaxy is perturbed 
by some non-axisymmetric component, its phase-space 
structure will change, the degeneracy will be broken, and 
resonant islands will occur, regions in phase-space sur- 
rounding resonant orbits. The orbits in these islands are 
near-resonant and oscillate around their resonant par- 
ent orbit. When the Galaxy slowly evolves, for instance, 
when the Galactic bar slows down and strengthens, these 
resonant islands sweep through phase-space, and stars 
trapped in them will be shifted to different orbits - an 
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effect also known from the dynamics of the solar system. 

Actually, there is a clear pointer to the involvement 
of resonances in the motion of nearby stars: the re- 
cent re-determination of Oort's constants from HIP- 
PARC OS Cepheids by Feast & Whitelock (1997) gives 
(in kms^k pc" 1 ) Q = A — B = 27.2 ± 0.9, K = 
2\JB 2 - AB = 36.7 ± 1.6, and H : k = 2.97 ± 0.07 : 
4. That is very precisely a 3:4 resonance between the 
azimuthal and radial orbital frequency, a ratio which 
appears naturally in a slightly declining circular-speed 
curve, e.g. of the form v c oc i? -1 / 9 . 3:4 resonant orbits 
complete four radial oscillations in the same time they 
rotate three times around the Galactic center; during 
this time they visit any given azimuth with three differ- 
ent velocities. In this respect it is intriguing that the 
stars in Bl, which are at most two orbital rotation pe- 
riods old, show just two distinct peaks in their velocity 
distribution. 

Currently, the only model that aims to explain the 
structure in (v x ,v y ) does in fact invoke a resonance, al- 
though, not the one mentioned above. This model is due 
to Kalnajs (1991), who starts from the observation of just 
two major maxima (at the Sirius and Hyades peaks), and 
locates the Sun at the outer Lindblad resonance (OLR) 
of the Galactic bar. The closed orbits inside and outside 
the OLR are elongated, respectively, perpendicular and 
parallel to the bar. At some azimuths these differently 
orientated orbits cross and naturally create a double- 
peaked distribution in (v x , v y ). This model gives pattern 
speed and projection angle of the Galactic bar in rough 
agreement with other methods, but does not account for 
the asymmetric drift relation amongst moving groups. 

6. EVIDENCE FOR THE STELLAR WARP 

In Figure ||, in all samples except Bl and best visible 
in GI and AL, the few outermost contours are somewhat 
skewed in the sense that at positive v y more stars are 
moving upwards w.r.t. the LSR than downwards. This 
is nicely illustrated in Figure ^, where the mean vertical 
motion due to f(v) inferred from the full sample is plot- 
ted versus v y . While v z is roughly constant at the LSR 
value of — 7kms~ 1 for v y < lOkms -1 (the bends are 
presumably due to the moving groups), it clearly curves 
upwards for v y J> lOkms -1 (this is not an ordinary tilt, 
which would produce a straight line in this diagram). 

This effect can be nicely explained by the Galactic 
warp. The Sun roughly lies at the inner edge of the 
warp on the line of nodes such that nearby stars that 
participate in the warp have v z > 0. For such stars 
to enter our very local sample, they must be near the 
pericenter of their orbits, and hence have v y > 0. To 
be more quantitative, let me model the height z of the 



Galactic disk as a function of galacto-centric distance R 
and azimuth <p (with ip = at the Sun) by 



z = h(R) sm((p + ilpt), 



(38) 



where fl p is the pattern speed with Q p > meaning a ret- 
rograde motion w.r.t. the stellar orbits, as expected from 
almost all theoretical warp models. I take the height 
function 



h{R) = (R-r w ) 2 /r hl 



(39) 



where r w is the edge of the flat part of the disk and 
then sets the amplitude of the warp. For a star with 
guiding center i?, we expect from the conservation of 
angular momentum 



R 

Vy = —n(R) - R n(R ) - v Gy 



(40) 



and a mean vertical motion of v z — v z {R) — v z (Rq) — Vq z 
with (using ip = fit) 



v z (R) = — = (n(R) + n p ) h(R) cos<p, 



(41) 



where £l(R) is the circular frequency and v Q = (10.0, 5.2, 
7.2) kms -1 is the solar motion w.r.t. the LSR as derived 
in paper I. The lines in Figure ^ have been computed as- 
suming a slightly falling rotation curve of the form Rfl cx 
fl- 1 / 9 , R = 8kpc, an d Q(Ro) = 27.2 km s" 1 kpc -1 
(Feast & Whitelock 1997). The dashed line is for a model 
with r w — 6.5 kpc, r/j = 15kpc, given for the Hi by 
Diplas & Savage (1990), and f2 p = 0. Note that, for the 
rotation curve adopted, v y = 40kms~ 1 originates from 




Fig. 6. — The mean vertical motion, v z , measured from sample 
AL as a function of of v y . The errors bars show 1 cr errors due to 
Poisson noise; however, adjacent points are not independent but 
coupled by the smoothing. The triangle indicates the velocity of 
the LSR, and the lines are derived from models for the Galactic 
warp, see text. 
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a guiding center of R = 10 kpc. Clearly, this model does 
not fit the data and non-zero Q p would make it even 
worse]]. The obvious reason for this failure is the fact 
that the modeled warp starts well inside the solar cir- 
cle, and predicts a strong effect already near the LSR. 
The full (almost indistinguishable) lines show four mod- 
els with an inner edge of the warp of r w = Rq, various 
pattern speeds in the range < fl p < 30 km s -1 kpc -1 , 
and with rt chosen such that the pattern speed and the 
amplitude z\q of the warp at R = 10 kpc obey 



£2. 



km s 1 kpc 



4-6(zi /kpc) 
0.3(zio/kpc) 



(42) 



These models give good descriptions of the data at all 
valid points. Thus, to explain the observed effect, the 
Galactic warp must not start inside the solar circle. On 
the other hand, only a combination of the amplitude 
of the warp and its pattern speed is constraint by the 
data. With the above parameters for the Galactic rota- 
tion curve and i?n, the Hi data of Burton (1992) yield a 
warp amplitude Zio of 0.3 to 0.4 kpc, resulting in fl p of 13 
to 25kms _1 kpc -1 . The infrared data taken by DIRBE 
suggest that the stellar warp might have half that ampli- 
tude ( Freudenreich et al. 1994), in which case Vl p must 
be larger. 

7. CONCLUSION 

I have analyzed a kinematically unbiased sample of 
more than 14000 nearby stars with positions, parallaxes, 
and proper motions known accurately from ESA's as- 
trometric mission 'HIPPARCOS'. From the same sam- 



ple, we have re-determined in paper I (Dehncn & Binney 
|!998b| ), as a function of color, the mean velocity and 
velocity dispersion for main-sequence stars in the solar 
neighborhood. In this paper, the velocity distribution 
f{v) itself, rather than its first two moments, has been 
inferred. 

Although radial velocities are available for a fraction 
of these stars, they could not be used because they are 
predominantly known for high-velocity stars and would 
inevitably introduce a kinematic bias. Without knowl- 
edge of radial velocities, the inference of f(v) is for- 
mally equivalent to a deprojection and an example of an 
astronomical inverse problem. A maximum-penalized- 
likelihood (MPL) technique for the recovery of f(v) from 
the positions and tangential velocities of individual stars 
has been developed. 



7 Of course, I made several simplifications, for instance, assuming 
that all stars are precisely at R = Ro- However, this is unlikely to 
alter the conclusions. A more appropriate analysis, e.g. a Monte 
Carlo simulation, is beyond the scope of this paper. 



Poisson noise completely dominates the errors, and 
the number of stars is too small to recover the three- 
dimensional distribution with useful resolution, but any 
two-dimensional projection can be determined with rea- 
sonable accuracy. The resulting distributions of nearby 
stars in (v x ,v y ), (v x ,v z ), and (v y ,V z ) are shown in Fig- 
ures [| and [| respectively. (v x , v y , and v z denote 
the velocity components in the directions of the Galactic 
center, Galactic rotation, and the north Galactic pole, 
respectively). The sample has been split into four color 
bins (Bl to B4) of main-sequence stars and non-main- 
sequence stars (GI), mainly giants. Additionally, the 
sample has been analyzed as a whole (AL). 

7.1. The Smooth Background 

At large velocities and/or redder color most stars are 
in a smooth and approximately ellipsoidal background 
distribution. The axis ratio of this background distri- 
bution increases from early to late stellar types. The 
increase is stronger in v z than it is in the horizontal mo- 
tions, implying that the vertical heating becomes rela- 
tively more important for on average older and hence dy- 
namically hotter stellar populations. This is expected if 
scattering by spiral structure is important, since for stars 
with epicycle diameters larger than the inter- arm sepa- 
ration, this process becomes inefficient ( penkins 1992 ). 
From Figure ^, sample AL, I estimate that the effect 
sets in at \v x \ of ~ 60kms -1 corresponding to an epicy- 
cle diameter of ~ 3kpc (with k = 36.7kms _1 kpc" 1 ). 

The background distribution also shows nicely the 
asymmetric drift relation: for ever later stellar types, the 
outermost contours in Figs. |^ to || are larger and their 
centroid is shifted more and more to negative v y . These 
contours appear to be aligned with the v x axis (radial) 
but slightly skewed w.r.t. the v y and v z axes. The lat- 
ter can nicely be explained by the Galactic warp: orbits 
participating in it move upwards and must have v y > 
in order to enter our local sample. A more quantita- 
tive analysis reveals that the warp must not start within 
the solar circle and has pattern speed ;> 13kms _1 kpc -1 
with sense of rotation opposite to the stellar orbits, as is 
theoretically expected. 

7.2. The Moving Groups 

Apart from the smooth background, f(v) shows a 
lot of structure, in particular in the horizontal motions 
(Fig. ||): there are various major and minor maxima and 
smaller features that seem to be real, in as much as they 
appear in more than one of the distinct subsamples. A 
peak in f(v) corresponds to a group of stars moving 
with the same velocity, and indeed, many maxima corre- 
spond to one of the moving groups already identified by 
Kapteyn (1905) and later studied by Eggen (e.g. 1965). 
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The fact that these moving groups are much less visible 
in v z can be attributed to phase-mixing. What seems to 
be surprising, however, is that the distribution of mov- 
ing groups obeys an asymmetric drift relation, similar 
to the smooth background: the apparently older groups 
(as judged from their minimum color) are more wide 
spread in (v x , v y ) and lag w.r.t. the LSR, i.e. they are on 
non-circular orbits. A possible explanation for a moving 
groups on such an eccentric orbits is as follows. A cluster 
of stars is born onto a near-circular orbit similar to those 
of the molecular gas. The stars might then be trapped 
into a resonance with a non-axisymmetric force field like 
that of the Galactic bar. If this force field slowly evolves, 
the resonances are shifted in orbit space and with them 
the stars trapped. This mechanism is well known from 
solar system dynamics and has recently been proposed 
to explain the thick disk ( |Sridhar fc Touma 199G| ). 

To interpret properly the results of this paper, i.e. to 
answer the question why f(v) shows the observed struc- 
ture, we clearly need a better understanding of the pro- 
cesses potentially involved in the dynamical evolution of 
the solar neighborhood, in particular, with regard to the 
effect of orbital resonances. These processes include (i) 
phase-mixing; (ii) scattering by small random pertur- 
bations of the Galactic force field, like giant molecular 
clouds and temporary spiral arms; (iii) interactions with 
large transient perturbers, for example, merging satel- 
lites of the Milky Way, such as the Saggritarius dwarf 
galaxy; and (iv) forcing by regular non-axisymmetric 
components of the gravitational potential, like the cen- 
tral bar, a long-living spiral pattern, a triaxial halo, or 
the Magellanic clouds. For the processes (i), (iii), and 
(iv), resonances in the stellar motions and/or between 
these and the motions of the perturbing agents are likely 
to be important and may lead to a behavior completely 
different from that of the non-resonant case. 
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